EDISON-WMW: Exact Dynamic Programing Solution of the Wilcoxon–Mann–Whitney Test

Authors

  • Alexander Marx
  • Christina Backes
  • Eckart Meese
  • Hans-Peter Lenhof
  • Andreas Keller
Abstract

In many research disciplines, hypothesis tests are applied to evaluate whether findings are statistically significant or could be explained by chance. The Wilcoxon-Mann-Whitney (WMW) test is among the most popular hypothesis tests in medicine and life science for analyzing whether two groups of samples are equally distributed. This nonparametric statistical homogeneity test is commonly applied in molecular diagnosis. Computing the exact solution of the WMW test generally requires high combinatorial effort for large sample cohorts containing a significant number of ties. Hence, the P value is frequently approximated by a normal distribution. We developed EDISON-WMW, a new approach to calculate the exact permutation P value of the two-tailed unpaired WMW test without any corrections required and allowing for ties. The method relies on dynamic programing to solve the combinatorial problem of the WMW test efficiently. Beyond a straightforward implementation of the algorithm, we present different optimization strategies and a parallel solution. Using our program, the exact P value for large cohorts containing more than 1000 samples with ties can be calculated within minutes. We demonstrate the performance of this novel approach on randomly generated data, benchmark it against 13 other commonly applied approaches, and evaluate molecular biomarkers for lung carcinoma and chronic obstructive pulmonary disease (COPD). We found that approximated P values were generally higher than the exact solution provided by EDISON-WMW. Importantly, the algorithm can also be applied to high-throughput omics datasets, where hundreds or thousands of features are included. To provide easy access to the multi-threaded version of EDISON-WMW, a web-based solution of our algorithm is freely available at http://www.ccb.uni-saarland.de/software/wtest/.
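
To make the idea in the abstract concrete, the following is a minimal Python sketch of how dynamic programing can count rank-sum configurations exactly when ties are present. It is not the EDISON-WMW implementation, and it includes none of the optimization or parallelization strategies the abstract mentions. The function names (`doubled_midranks`, `exact_wmw_two_sided`) are hypothetical, and the two-sided rule used here (counting every group assignment whose rank sum lies at least as far from its expected value as the observed one) is an assumed convention that may differ from the paper's.

```python
from fractions import Fraction
from math import comb


def doubled_midranks(pooled):
    """Return 2 * midrank for every pooled value, so tied ranks stay integers."""
    order = sorted(range(len(pooled)), key=lambda i: pooled[i])
    ranks = [0] * len(pooled)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current block of tied values.
        while j + 1 < len(order) and pooled[order[j + 1]] == pooled[order[i]]:
            j += 1
        # 1-based positions i+1 .. j+1 share midrank (i + j + 2) / 2.
        for k in range(i, j + 1):
            ranks[order[k]] = i + j + 2
        i = j + 1
    return ranks


def exact_wmw_two_sided(x, y):
    """Exact two-sided permutation P value of the rank-sum statistic (sketch)."""
    n1, n = len(x), len(x) + len(y)
    ranks = doubled_midranks(list(x) + list(y))
    w_obs = sum(ranks[:n1])  # doubled rank sum of the first group

    # counts[k][s] = number of ways to pick k pooled samples whose
    # doubled ranks add up to s.
    counts = [dict() for _ in range(n1 + 1)]
    counts[0][0] = 1
    for r in ranks:
        # Go through k in descending order so each sample is used at most once.
        for k in range(n1, 0, -1):
            for s, c in counts[k - 1].items():
                counts[k][s + r] = counts[k].get(s + r, 0) + c

    mean = n1 * (n + 1)  # expected doubled rank sum under the null hypothesis
    dev_obs = abs(w_obs - mean)
    extreme = sum(c for s, c in counts[n1].items() if abs(s - mean) >= dev_obs)
    return Fraction(extreme, comb(n, n1))
```

Doubling the midranks keeps all table indices integral, and Python's arbitrary-precision integers with `Fraction` avoid rounding in the counts. For example, `exact_wmw_two_sided([1, 2], [3, 4])` considers all six ways to assign two of the four samples to the first group.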

Similar resources

unifiedWMWqPCR: the unified Wilcoxon-Mann-Whitney test for analyzing RT-qPCR data in R

MOTIVATION Recently, De Neve et al. proposed a modification of the Wilcoxon-Mann-Whitney (WMW) test for assessing differential expression based on RT-qPCR data. Their test, referred to as the unified WMW (uWMW) test, incorporates a robust and intuitive normalization and quantifies the probability that the expression from one treatment group exceeds the expression from another treatment group. H...

Wilcoxon-Mann-Whitney or t-test? On assumptions for hypothesis tests and multiple interpretations of decision rules.

In a mathematical approach to hypothesis tests, we start with a clearly defined set of hypotheses and choose the test with the best properties for those hypotheses. In practice, we often start with less precise hypotheses. For example, often a researcher wants to know which of two groups generally has the larger responses, and either a t-test or a Wilcoxon-Mann-Whitney (WMW) test could be accep...

Exploiting the Link Between the Wilcoxon-Mann-Whitney Test and a Simple Odds Statistic

Over a quarter-century ago, Alan Agresti (Biometrics, 1980) proposed using the generalized odds ratio (genOR) to summarize the association between two ordinal variables. Unfortunately, genOR is still largely unknown, even though it is elegantly straightforward and fills key voids in the working statistician’s toolbox. An extension of it, a statistic we are calling “WMWodds,” is an ideal effect-...

A Bootstrap Test of Stochastic Equality of Two Populations

When comparing two variables with nonnormal distributions, application of the Wilcoxon-Mann-Whitney test (WMW) is a common choice. However, it is only valid to test the null hypothesis stating equality of the distributions. Sometimes the hypothesis of interest is H0: P(X < Y) = P(X > Y) against P(X < Y) ≠ P(X > Y), called stochastic equality and inequality. Here we propose a bootstra...

t-tests, non-parametric tests, and large studies—a paradox of statistical practice?

BACKGROUND During the last 30 years, the median sample size of research studies published in high-impact medical journals has increased manyfold, while the use of non-parametric tests has increased at the expense of t-tests. This paper explores this paradoxical practice and illustrates its consequences. METHODS A simulation study is used to compare the rejection rates of the Wilcoxon-Mann-Whi...

Journal title:

Volume 14  Issue

Pages -

Publication date 2016